Let’s call the help function of library and see what happens.
help(library)
You can add a new chunk of code by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
Let’s take a quick look at what our notebook looks like right now.
Tibble - Tibbles as dataframes
Tidyr
Regressions:
Output:
Let’s start with Tidyr and import the packages we need into our notebook.
#install.packages(c("tidyverse", "dplyr", "WDI"))
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ stringr 1.4.0
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(WDI)
R follows a set of conventions that makes one layout of tabular data much easier to work with. We usually call this format “Long”, and it will always have:
So, generally, you’re trying to get your data into that shape. Luckily, there are a few commands to help you with this.
There are lots of ways to do this: reshape(), melt() [from wide to long], cast() [from long to wide], gather() [from wide to long], spread() [from long to wide].
But we’ll learn to transform data with pivot_longer and pivot_wider.
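Before working with the real data, here is a minimal sketch of the two shapes on a made-up table. The data frame `wide` and its values are hypothetical and only there for illustration; `pivot_longer` and `pivot_wider` come from tidyr.

```r
library(tidyr)

# Hypothetical toy data in wide format: one row per country, one column per year.
wide <- data.frame(
  country = c("A", "B"),
  `2000` = c(1.5, 2.0),
  `2001` = c(1.7, 2.4),
  check.names = FALSE  # keep "2000" and "2001" as column names
)

# Wide to long: the year columns collapse into a year/growth pair of columns.
long <- pivot_longer(wide, cols = c("2000", "2001"),
                     names_to = "year", values_to = "growth")

# Long back to wide: one column per distinct value of year.
wide_again <- pivot_wider(long, names_from = year, values_from = growth)

print(long)
print(wide_again)
```

The long table has one row per country-year pair, which is the layout most modelling and plotting functions expect.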
Let’s import the dataset that we’ll use today - growth rate by country - using the R package WDI. Be sure to install the package first with install.packages("WDI").
wdi <- WDI(country = "all", start = 2000, end = 2015, extra = TRUE,
indicator = c("NY.GDP.MKTP.KD.ZG"))
wdi
Now, let’s try to pivot our data to the wide format.
When we pivot wider, it’s going to look something like this:
wdi_wide <- wdi %>% pivot_wider(names_from=year,values_from=NY.GDP.MKTP.KD.ZG)
wdi_wide
We can transform the data into long format with the function pivot_longer.
It’ll look something like this:
Before I pivot, I’m going to check the names of my columns to see what columns I need to transform.
names(wdi_wide)
## [1] "iso2c" "country" "iso3c" "region" "capital" "longitude"
## [7] "latitude" "income" "lending" "2009" "2008" "2005"
## [13] "2004" "2007" "2015" "2013" "2003" "2006"
## [19] "2010" "2002" "2014" "2012" "2011" "2000"
## [25] "2001"
pivot_longer(wdi_wide, c('2009':'2001'), names_to='year',values_to='gdp')
Another handy thing to learn in R is separate.
Let’s say you have dates that you want to split into several columns. You can use the function separate to create a distinct column for each part.
stocks = read.csv("all_stocks_5yr.csv")
date_sep <- stocks %>% separate(date, into = c("year", "month", "day"), sep = '-')
# The arrow denotes assignment. Its direction matters, but it can sit before or after the code: the line below does exactly the same thing as the line above.
stocks %>% separate(date, into = c("year", "month", "day"), sep = '-') -> date_sep
date_sep
The sibling of separate is unite, which combines several columns into one.
date_sep %>% unite(date, c("month", "day", "year"), sep = "/")
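If you don’t have all_stocks_5yr.csv at hand, the same round trip can be tried on a toy table. This is only a sketch: the data frame `toy` and its values are made up for illustration.

```r
library(tidyr)

# Hypothetical dates; in the walkthrough these come from all_stocks_5yr.csv.
toy <- data.frame(date = c("2013-02-08", "2013-02-11"),
                  close = c(14.75, 14.46))

# Split the date string at each "-" into three character columns.
parts <- separate(toy, date, into = c("year", "month", "day"), sep = "-")

# Glue the pieces back together in month/day/year order.
us_style <- unite(parts, date, c("month", "day", "year"), sep = "/")

print(us_style)
```

Note that separate leaves the new columns as character strings unless you ask it to convert them.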
Now let’s get to running those regressions.
The general pattern is to call a model function and, inside that call, write the regression you want to run as a formula.
Stata’s “reg” is R’s “lm” which stands for linear model and is at the core of regression analysis.
The model will look something like this: lm(y ~ x1 + x2 + x3 + ..., data = df)
You can also call each of the columns within a dataframe like this: lm(df$y ~ df$x1 + df$x2 + df$x3 + ...). But I prefer the simpler form with data = df.
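To see that the two ways of writing the model give the same fit, here is a small sketch on R’s built-in mtcars data. It stands in for df here; mpg, wt and hp are just convenient columns, not part of our jobs data.

```r
# Built-in mtcars stands in for df; mpg, wt and hp stand in for y, x1 and x2.
fit_data   <- lm(mpg ~ wt + hp, data = mtcars)
fit_dollar <- lm(mtcars$mpg ~ mtcars$wt + mtcars$hp)

# Same estimates, but the coefficient names differ ("wt" vs "mtcars$wt").
print(coef(fit_data))
print(coef(fit_dollar))
```

The data = form keeps the coefficient names short, which makes regression tables easier to read later on.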
Let’s try running a basic OLS regression with our jobs dataset.
jobs = read.csv('/Users/mkaltenberg/Documents/GitHub/Data_Analysis_Python_R/Becoming a Viz Kid/job-automation-probability.csv')
ols <- lm(prob ~ average_ann_wage + numbEmployed , data = jobs)
Alright, so we got a regression!
We can view some of the results in the stored item on the left. Or we can look into it with the function summary().
summary(ols)
##
## Call:
## lm(formula = prob ~ average_ann_wage + numbEmployed, data = jobs)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.72928 -0.24991 0.06363 0.24745 0.92316
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.213e-01 2.622e-02 35.130 <2e-16 ***
## average_ann_wage -6.971e-06 4.031e-07 -17.293 <2e-16 ***
## numbEmployed 1.168e-08 2.761e-08 0.423 0.672
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3078 on 699 degrees of freedom
## Multiple R-squared: 0.3029, Adjusted R-squared: 0.3009
## F-statistic: 151.8 on 2 and 699 DF, p-value: < 2.2e-16
# if you want to remove a particular object, you can use the function rm()
rm(date_sep)
That’s better! Ok, so, we can see all of our general statistics here. We can also view specific parts by using the dollar sign to indicate a part of the output we want to view
summary(ols)$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.212632e-01 2.622403e-02 35.1304979 1.484084e-156
## average_ann_wage -6.970858e-06 4.030936e-07 -17.2933997 4.776142e-56
## numbEmployed 1.167886e-08 2.761272e-08 0.4229523 6.724601e-01
You can run the regression on a subset of the data using filter and grepl. NOTE the placement of the parentheses.
The equivalent in Stata would be: reg prob average_ann_wage numbEmployed if education != “No formal educational credential”
ols2 <- lm(prob ~ average_ann_wage + numbEmployed , data = jobs %>% filter(!(grepl("No formal educational credential", education))))
summary(ols2)
##
## Call:
## lm(formula = prob ~ average_ann_wage + numbEmployed, data = jobs %>%
## filter(!(grepl("No formal educational credential", education))))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.69748 -0.27219 0.03687 0.27719 0.90587
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.863e-01 3.016e-02 29.383 <2e-16 ***
## average_ann_wage -6.556e-06 4.447e-07 -14.743 <2e-16 ***
## numbEmployed -1.239e-08 4.216e-08 -0.294 0.769
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3202 on 601 degrees of freedom
## Multiple R-squared: 0.2656, Adjusted R-squared: 0.2632
## F-statistic: 108.7 on 2 and 601 DF, p-value: < 2.2e-16
This is equivalent to the following, where we filter first with a pipe and then run the model (a reminder of pipes and how we can use them):
jobs_noedu =
jobs %>%
filter(education != "No formal educational credential")
# filter(!(grepl("No formal educational credential", education))) ## This also works
ols2 <- lm(prob ~ average_ann_wage + numbEmployed , data = jobs_noedu)
summary(ols2)
##
## Call:
## lm(formula = prob ~ average_ann_wage + numbEmployed, data = jobs_noedu)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.69748 -0.27219 0.03687 0.27719 0.90587
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.863e-01 3.016e-02 29.383 <2e-16 ***
## average_ann_wage -6.556e-06 4.447e-07 -14.743 <2e-16 ***
## numbEmployed -1.239e-08 4.216e-08 -0.294 0.769
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3202 on 601 degrees of freedom
## Multiple R-squared: 0.2656, Adjusted R-squared: 0.2632
## F-statistic: 108.7 on 2 and 601 DF, p-value: < 2.2e-16
Often (like REALLY REALLY often) we want to include robust standard errors. In Stata, this is an option in the command reg.
White Standard Error Adjustment (Robust standard errors) in Stata: reg prob average_ann_wage numbEmployed, r
In R, it’s an entirely new function. So, let’s import the package estimatr.
library(estimatr)
ols1_robust = lm_robust(prob ~ average_ann_wage + numbEmployed , data = jobs)
summary(ols1_robust)
##
## Call:
## lm_robust(formula = prob ~ average_ann_wage + numbEmployed, data = jobs)
##
## Standard error type: HC2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) CI Lower
## (Intercept) 9.213e-01 2.860e-02 32.2078 3.044e-140 8.651e-01
## average_ann_wage -6.971e-06 4.705e-07 -14.8151 2.182e-43 -7.895e-06
## numbEmployed 1.168e-08 2.046e-08 0.5708 5.683e-01 -2.849e-08
## CI Upper DF
## (Intercept) 9.774e-01 699
## average_ann_wage -6.047e-06 699
## numbEmployed 5.185e-08 699
##
## Multiple R-squared: 0.3029 , Adjusted R-squared: 0.3009
## F-statistic: 113.5 on 2 and 699 DF, p-value: < 2.2e-16
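Note that the output reports HC2 standard errors, which is the default in lm_robust, while Stata’s ", r" option uses HC1. If you need to match Stata, lm_robust has an se_type argument. The sketch below uses the built-in mtcars data as a stand-in for our jobs data, so the variable names are placeholders.

```r
library(estimatr)

# mtcars stands in for the jobs data, which is not bundled with R.
fit_hc2   <- lm_robust(mpg ~ wt + hp, data = mtcars)  # default: HC2
fit_stata <- lm_robust(mpg ~ wt + hp, data = mtcars,
                       se_type = "stata")             # HC1, as in Stata's ", r"

# Point estimates are identical; only the standard errors change.
print(cbind(HC2 = fit_hc2$std.error, stata = fit_stata$std.error))
```

With a few hundred observations the two usually differ only in later decimal places, but it is worth knowing which one you are reporting.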
Another thing you may want to do is include a dummy variable in your regression.
In R, we generally handle these as factors. In Stata, you can include factors as i.dummy_variable.
R makes this pretty easy - it automatically knows that you are using a string variable and will create categorical variables (factors) out of that.
ols_edu = lm_robust(prob ~ average_ann_wage + numbEmployed + factor(education), data = jobs)
summary(ols_edu)
##
## Call:
## lm_robust(formula = prob ~ average_ann_wage + numbEmployed +
## factor(education), data = jobs)
##
## Standard error type: HC2
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 6.311e-01 5.882e-02
## average_ann_wage -2.834e-06 5.229e-07
## numbEmployed -5.931e-09 2.179e-08
## factor(education)Bachelor's degree -1.842e-01 5.631e-02
## factor(education)Doctoral or professional degree -1.834e-01 7.130e-02
## factor(education)High school diploma or equivalent 1.998e-01 5.262e-02
## factor(education)Master's degree -2.956e-01 6.292e-02
## factor(education)No formal educational credential 2.471e-01 5.597e-02
## factor(education)Postsecondary nondegree award -1.187e-02 6.813e-02
## factor(education)Some college, no degree 1.619e-01 1.297e-01
## t value Pr(>|t|) CI Lower
## (Intercept) 10.7300 5.952e-25 5.156e-01
## average_ann_wage -5.4190 8.281e-08 -3.860e-06
## numbEmployed -0.2722 7.856e-01 -4.872e-08
## factor(education)Bachelor's degree -3.2710 1.125e-03 -2.947e-01
## factor(education)Doctoral or professional degree -2.5728 1.030e-02 -3.234e-01
## factor(education)High school diploma or equivalent 3.7970 1.593e-04 9.649e-02
## factor(education)Master's degree -4.6982 3.167e-06 -4.192e-01
## factor(education)No formal educational credential 4.4160 1.166e-05 1.373e-01
## factor(education)Postsecondary nondegree award -0.1743 8.617e-01 -1.456e-01
## factor(education)Some college, no degree 1.2484 2.123e-01 -9.274e-02
## CI Upper DF
## (Intercept) 7.466e-01 692
## average_ann_wage -1.807e-06 692
## numbEmployed 3.685e-08 692
## factor(education)Bachelor's degree -7.363e-02 692
## factor(education)Doctoral or professional degree -4.344e-02 692
## factor(education)High school diploma or equivalent 3.031e-01 692
## factor(education)Master's degree -1.721e-01 692
## factor(education)No formal educational credential 3.570e-01 692
## factor(education)Postsecondary nondegree award 1.219e-01 692
## factor(education)Some college, no degree 4.166e-01 692
##
## Multiple R-squared: 0.4489 , Adjusted R-squared: 0.4417
## F-statistic: 93.9 on 9 and 692 DF, p-value: < 2.2e-16
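One category never shows up in the coefficient list: R drops the first level of the factor and uses it as the baseline that all other categories are compared against. Here is a minimal sketch of how to see and change that baseline, using the built-in iris data as a stand-in (Species plays the role of education).

```r
# iris stands in for the jobs data; Species plays the role of education.
fit <- lm(Sepal.Length ~ Sepal.Width + factor(Species), data = iris)
print(names(coef(fit)))  # no "setosa" term: the first level is the baseline

# relevel() picks a different baseline category for the comparison.
iris2 <- transform(iris, Species = relevel(Species, ref = "versicolor"))
fit2 <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris2)
print(names(coef(fit2)))  # now "versicolor" is the omitted category
```

Changing the baseline changes how each dummy coefficient is interpreted, but not the overall fit of the model.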
There is a convenient syntax on how to include interaction terms:
x1:x2 “crosses” the variables (equivalent to including only the x1 × x2 interaction term)
x1/x2 “nests” the second variable within the first (equivalent to x1 + x1:x2)
x1*x2 includes all parent and interaction terms (equivalent to x1 + x2 + x1:x2)
Almost always you’ll use * (all parent and interaction terms). There are situations in which you wouldn’t, but it’s rare.
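A quick way to check what each operator expands to is to look at the columns R would build for the model. This sketch uses a made-up data frame `d` with two numeric variables; model.matrix is part of base R.

```r
# Hypothetical data with two numeric regressors.
d <- data.frame(x1 = c(1, 2, 3, 4, 5, 6),
                x2 = c(2, 1, 4, 3, 6, 5))

# Column names of the design matrix show which terms each formula includes.
cross <- colnames(model.matrix(~ x1:x2, data = d))  # interaction term only
nest  <- colnames(model.matrix(~ x1/x2, data = d))  # x1 plus x1:x2
full  <- colnames(model.matrix(~ x1*x2, data = d))  # x1, x2 and x1:x2

print(cross)
print(nest)
print(full)
```

Every version also includes the intercept column, which R adds by default.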
So, let’s check it out:
int = lm_robust(prob ~ average_ann_wage + numbEmployed*education, data = jobs)
summary(int)
##
## Call:
## lm_robust(formula = prob ~ average_ann_wage + numbEmployed *
## education, data = jobs)
##
## Standard error type: HC2
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 6.664e-01 7.856e-02
## average_ann_wage -2.847e-06 5.350e-07
## numbEmployed -5.138e-07 8.928e-07
## educationBachelor's degree -2.247e-01 7.686e-02
## educationDoctoral or professional degree -2.147e-01 9.046e-02
## educationHigh school diploma or equivalent 1.742e-01 7.376e-02
## educationMaster's degree -2.848e-01 8.770e-02
## educationNo formal educational credential 2.022e-01 7.617e-02
## educationPostsecondary nondegree award -4.516e-02 9.087e-02
## educationSome college, no degree 8.691e-03 1.601e-01
## numbEmployed:educationBachelor's degree 5.455e-07 8.975e-07
## numbEmployed:educationDoctoral or professional degree 4.856e-07 9.151e-07
## numbEmployed:educationHigh school diploma or equivalent 4.514e-07 8.946e-07
## numbEmployed:educationMaster's degree -1.708e-07 9.778e-07
## numbEmployed:educationNo formal educational credential 5.334e-07 8.929e-07
## numbEmployed:educationPostsecondary nondegree award 4.995e-07 9.250e-07
## numbEmployed:educationSome college, no degree 6.669e-07 9.110e-07
## t value Pr(>|t|)
## (Intercept) 8.48369 1.343e-16
## average_ann_wage -5.32124 1.398e-07
## numbEmployed -0.57550 5.651e-01
## educationBachelor's degree -2.92354 3.575e-03
## educationDoctoral or professional degree -2.37356 1.789e-02
## educationHigh school diploma or equivalent 2.36178 1.847e-02
## educationMaster's degree -3.24796 1.219e-03
## educationNo formal educational credential 2.65441 8.129e-03
## educationPostsecondary nondegree award -0.49705 6.193e-01
## educationSome college, no degree 0.05428 9.567e-01
## numbEmployed:educationBachelor's degree 0.60777 5.435e-01
## numbEmployed:educationDoctoral or professional degree 0.53070 5.958e-01
## numbEmployed:educationHigh school diploma or equivalent 0.50455 6.140e-01
## numbEmployed:educationMaster's degree -0.17471 8.614e-01
## numbEmployed:educationNo formal educational credential 0.59745 5.504e-01
## numbEmployed:educationPostsecondary nondegree award 0.53996 5.894e-01
## numbEmployed:educationSome college, no degree 0.73201 4.644e-01
## CI Lower CI Upper
## (Intercept) 5.122e-01 8.207e-01
## average_ann_wage -3.898e-06 -1.797e-06
## numbEmployed -2.267e-06 1.239e-06
## educationBachelor's degree -3.756e-01 -7.380e-02
## educationDoctoral or professional degree -3.923e-01 -3.710e-02
## educationHigh school diploma or equivalent 2.938e-02 3.190e-01
## educationMaster's degree -4.570e-01 -1.126e-01
## educationNo formal educational credential 5.263e-02 3.518e-01
## educationPostsecondary nondegree award -2.236e-01 1.332e-01
## educationSome college, no degree -3.057e-01 3.231e-01
## numbEmployed:educationBachelor's degree -1.217e-06 2.308e-06
## numbEmployed:educationDoctoral or professional degree -1.311e-06 2.282e-06
## numbEmployed:educationHigh school diploma or equivalent -1.305e-06 2.208e-06
## numbEmployed:educationMaster's degree -2.091e-06 1.749e-06
## numbEmployed:educationNo formal educational credential -1.220e-06 2.286e-06
## numbEmployed:educationPostsecondary nondegree award -1.317e-06 2.316e-06
## numbEmployed:educationSome college, no degree -1.122e-06 2.456e-06
## DF
## (Intercept) 685
## average_ann_wage 685
## numbEmployed 685
## educationBachelor's degree 685
## educationDoctoral or professional degree 685
## educationHigh school diploma or equivalent 685
## educationMaster's degree 685
## educationNo formal educational credential 685
## educationPostsecondary nondegree award 685
## educationSome college, no degree 685
## numbEmployed:educationBachelor's degree 685
## numbEmployed:educationDoctoral or professional degree 685
## numbEmployed:educationHigh school diploma or equivalent 685
## numbEmployed:educationMaster's degree 685
## numbEmployed:educationNo formal educational credential 685
## numbEmployed:educationPostsecondary nondegree award 685
## numbEmployed:educationSome college, no degree 685
##
## Multiple R-squared: 0.4527 , Adjusted R-squared: 0.4399
## F-statistic: 56.23 on 16 and 685 DF, p-value: < 2.2e-16
There are a LOT of packages out there to export and display various statistical results. You can take a look at them here
There are very few that export directly to Word (flextable and huxtable) - rather, you would export HTML output and then copy and paste it into Word or a presentation, or export a PDF from R Markdown. The reason there are so few packages that export to Word is that it’s a nightmare of bugs. Also, most researchers using R/Python use LaTeX for formatting, so most packages export to LaTeX quite easily.
We will focus on two of them that make output in a variety of formats easy and pretty - stargazer and modelsummary.
Most people who use R for data analysis use stargazer as a way to present regression tables and summary statistics. It is by far the most popular package for this - possibly because it’s so easy to use and the output looks very nice.
##
## My Data Models
## ===========================================================================================
## Dependent variable:
## ------------------------------------------------
## prob
## (1) (2)
## -------------------------------------------------------------------------------------------
## average_ann_wage -0.00001***
## (0.00000)
##
## educationBachelor's degree -0.202***
## (0.049)
##
## educationDoctoral or professional degree -0.216***
## (0.080)
##
## educationHigh school diploma or equivalent 0.201***
## (0.045)
##
## educationMaster's degree -0.306***
## (0.067)
##
## educationNo formal educational credential 0.250***
## (0.053)
##
## educationPostsecondary nondegree award -0.013
## (0.060)
##
## educationSome college, no degree 0.142
## (0.146)
##
## numbEmployed 0.000 -0.000
## (0.00000) (0.00000)
##
## median_ann_wage -0.00000***
## (0.00000)
##
## Constant 0.921*** 0.613***
## (0.026) (0.052)
##
## -------------------------------------------------------------------------------------------
## Observations 702 702
## R2 0.303 0.444
## Adjusted R2 0.301 0.436
## Residual Std. Error 0.308 (df = 699) 0.276 (df = 692)
## F Statistic 151.845*** (df = 2; 699) 61.304*** (df = 9; 692)
## ===========================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
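The call that produced the table above isn’t shown, so here is a minimal sketch of a comparable one. It uses the built-in mtcars data and two plain lm() fits as stand-ins for our models; m1 and m2 are placeholder names. As far as I know, stargazer works with lm objects but does not recognise lm_robust objects, so treat that as something to check for your own models.

```r
library(stargazer)

# mtcars stands in for the jobs data; these are plain lm() fits.
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# type = "text" prints an ASCII table to the console; title sets the caption.
stargazer(m1, m2, type = "text", title = "My Data Models")
```

Each model you pass becomes a numbered column, and variables that appear in only one model get a blank cell in the other.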
We can export the table to HTML with the option “out” - let’s check out what the file looks like after we export it.
##
## <table style="text-align:center"><caption><strong>My models</strong></caption>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td colspan="2">prob</td></tr>
## <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">average_ann_wage</td><td>-0.00001<sup>***</sup></td><td></td></tr>
## <tr><td style="text-align:left"></td><td>(0.00000)</td><td></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationBachelor's degree</td><td></td><td>-0.202<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.049)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationDoctoral or professional degree</td><td></td><td>-0.216<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.080)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationHigh school diploma or equivalent</td><td></td><td>0.201<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.045)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationMaster's degree</td><td></td><td>-0.306<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.067)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationNo formal educational credential</td><td></td><td>0.250<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.053)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationPostsecondary nondegree award</td><td></td><td>-0.013</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.060)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">educationSome college, no degree</td><td></td><td>0.142</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.146)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">numbEmployed</td><td>0.000</td><td>-0.000</td></tr>
## <tr><td style="text-align:left"></td><td>(0.00000)</td><td>(0.00000)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">median_ann_wage</td><td></td><td>-0.00000<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.00000)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Constant</td><td>0.921<sup>***</sup></td><td>0.613<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.026)</td><td>(0.052)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>702</td><td>702</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.303</td><td>0.444</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.301</td><td>0.436</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>0.308 (df = 699)</td><td>0.276 (df = 692)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>151.845<sup>***</sup> (df = 2; 699)</td><td>61.304<sup>***</sup> (df = 9; 692)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>
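A sketch of how such an export might look, again with mtcars as a stand-in. The file name models.html is hypothetical, and it is written to a temporary folder here so that nothing in your project gets overwritten.

```r
library(stargazer)

# mtcars stands in for the jobs data.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical output file in a temporary folder.
html_file <- file.path(tempdir(), "models.html")

# type = "html" builds an HTML table and out = writes it to disk.
# capture.output() only hides the copy that stargazer prints to the console.
invisible(capture.output(
  stargazer(fit, type = "html", title = "My models", out = html_file)
))

print(file.exists(html_file))
```

Inside an R Markdown chunk, setting the chunk option results = 'asis' lets the HTML render as a table in the knitted document instead of showing the raw tags.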
We can easily see the regression when we directly input the code in R Markdown
There are many options within stargazer that we can play around with to get our tables “just right” - and you will spend a lot of time doing this. Every researcher spends more time than they’d like getting their tables right.
A very useful cheat sheet on stargazer can be found here
##
## % Error: Unrecognized object type.
## % Error: Unrecognized object type.
modelsummary is a quick and easy way to export and view regression tables.
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 0.921 | 0.666 | 0.631 |
| | (0.026) | (0.079) | (0.059) |
| average_ann_wage | 0.000 | 0.000 | 0.000 |
| | (0.000) | (0.000) | (0.000) |
| numbEmployed | 0.000 | 0.000 | 0.000 |
| | (0.000) | (0.000) | (0.000) |
| educationBachelor’s degree | | -0.225 | |
| | | (0.077) | |
| educationDoctoral or professional degree | | -0.215 | |
| | | (0.090) | |
| educationHigh school diploma or equivalent | | 0.174 | |
| | | (0.074) | |
| educationMaster’s degree | | -0.285 | |
| | | (0.088) | |
| educationNo formal educational credential | | 0.202 | |
| | | (0.076) | |
| educationPostsecondary nondegree award | | -0.045 | |
| | | (0.091) | |
| educationSome college, no degree | | 0.009 | |
| | | (0.160) | |
| numbEmployed × educationBachelor’s degree | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationDoctoral or professional degree | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationHigh school diploma or equivalent | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationMaster’s degree | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationNo formal educational credential | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationPostsecondary nondegree award | | 0.000 | |
| | | (0.000) | |
| numbEmployed × educationSome college, no degree | | 0.000 | |
| | | (0.000) | |
| factor(education)Bachelor’s degree | | | -0.184 |
| | | | (0.056) |
| factor(education)Doctoral or professional degree | | | -0.183 |
| | | | (0.071) |
| factor(education)High school diploma or equivalent | | | 0.200 |
| | | | (0.053) |
| factor(education)Master’s degree | | | -0.296 |
| | | | (0.063) |
| factor(education)No formal educational credential | | | 0.247 |
| | | | (0.056) |
| factor(education)Postsecondary nondegree award | | | -0.012 |
| | | | (0.068) |
| factor(education)Some college, no degree | | | 0.162 |
| | | | (0.130) |
| Num.Obs. | 702 | 702 | 702 |
| R2 | 0.303 | 0.453 | 0.449 |
| R2 Adj. | 0.301 | 0.440 | 0.442 |
| AIC | 342.9 | | |
| BIC | 361.2 | | |
| Log.Lik. | -167.471 | | |
| F | 151.845 | | |
| se_type | | HC2 | HC2 |
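The code behind the table above isn’t shown, so here is a minimal sketch of how a table like it can be built. It uses mtcars as a stand-in for the jobs data, and the model names in the list are just labels I picked. Unlike stargazer, modelsummary can mix lm and lm_robust fits in one table.

```r
library(modelsummary)
library(estimatr)

# mtcars stands in for the jobs data; the list names become column headers.
models <- list(
  "Model 1" = lm(mpg ~ wt, data = mtcars),
  "Model 2" = lm_robust(mpg ~ wt + hp, data = mtcars)
)

# output = "markdown" prints a pipe table like the one above.
modelsummary(models, output = "markdown")

# output = "data.frame" returns the same table as a data frame to work with.
tab <- modelsummary(models, output = "data.frame")
print(head(tab))
```

The output argument also accepts a file name, in which case the file extension decides the export format.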
And summary statistics:
| | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| NY.GDP.MKTP.KD.ZG | 3987 | 6 | 3.9 | 5.2 | -62.1 | 4.0 | 123.1 |
| year | 16 | 0 | 2007.5 | 4.6 | 2000 | 2007.5 | 2015 |
## Loading required package: kableExtra
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
| Name | Class | Label | Values |
|---|---|---|---|
| iso2c | character | NULL | |
| country | character | NULL | |
| NY.GDP.MKTP.KD.ZG | numeric | GDP growth (annual %) | Num: -62.076 to 123.14 |
| year | integer | NULL | Num: 2000 to 2015 |
| iso3c | character | NULL | |
| region | character | NULL | |
| capital | character | NULL | |
| longitude | character | NULL | |
| latitude | character | NULL | |
| income | character | NULL | |
| lending | character | NULL | |
There are also other options that exist with datasummary that you can check out here